Manuscript title: Evolutionary and ecological processes underlying geographic variation in innate bird songs 
Marcos Maldonado-Coelho1,2, Sidnei S. dos Santos3, Morton L. Isler4, Maria Svensson-Coelho5, Manuelita Sotello-Munoz2, Cristina Y. Miyaki2, Robert E. Ricklefs1 and John G. Blake6.
1. Department of Biology, University of Missouri-St. Louis, R223 Research Building, One University Boulevard, St. Louis, Missouri, 63121-4499, USA
2. Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Rua do Matão 277, Cidade Universitária, São Paulo, São Paulo, 05508-900, Brazil
3. Instituto de Biologia, Departamento de Zoologia, Universidade Federal da Bahia, Rua Barão de Geremoabo, 147, Ondina, Salvador, Bahia, 40170115, Brazil.
4. Smithsonian Institution, National Museum of Natural History, Division of Birds, PO Box 37012, MRC 116, Washington, DC, 20013-7012, USA
5. Department of Biology, Lund University, Sölvegatan 37, 223 62 Lund, Sweden
6.  Department of Wildlife Ecology and Conservation, 110 Newins-Ziegler Hall, University of Florida, Gainesville, Florida, 32611-0430, USA

Corresponding author: M. Maldonado-Coelho maldonadocoelhom@gmail.com


Summary
This study investigates the evolutionary and ecological factors underlying geographic variation in the songs of two fire-eye antbird species (genus Pyriglena) along the South American Atlantic Forest. We combine vocal, genetic, morphological and environmental variation in an integrative approach.

M. Maldonado-Coelho, Sidnei S. dos Santos, M. Sotello-Munoz and M. Svensson-Coelho are responsible for collecting data.  M. Maldonado-Coelho is responsible for writing code.

#########################################################################################
R package versions used in analyses
RStudio v. 1.4.1106
gratia v.0.6.0 # Author: Gavin L. Simpson (2021)
mgcv v.1.8-35 # Author: Simon Wood (2017)
ggplot2 v.3.3.3 # Author: Hadley Wickham et al. (2020)
scam v.1.2-11 # Author: Natalya Pya (2021)
cowplot v.1.1.1 # Author: Claus O. Wilke (2020)
tidyr v.1.1.3 # Author: Hadley Wickham et al. (2021)
visreg v.2.7.0 # Author: Patrick Breheny and Woodrow Burchett (2020)
lmerTest v.3.1-3 # Author: Alexandra Kuznetsova et al. (2020)
nlme v.3.1-152 # Author: José Pinheiro et al. (2021)    # gamm for lme() in nlme package
emmeans v.1.6.2-1 # Author: Russell V. Lenth et al. (2021)
MuMIn v.1.43.17 # Author: Kamil Barton (2020)
itsadug v.2.4 #Author: Jacolien van Rij et al. (2020)
MASS v.7.3-54 #Author: Brian Ripley et al (2021)
biotools v.4.2 # Author: Anderson Rodrigo da Silva (2021)
caret v.6.0-88 # Author: Max Kuhn et al. (2021)
e1071 v.1.7-7 # Author: David Meyer et al. (2021)
vcd v.1.4-9 # Author: David Meyer (2021)
RVAideMemoire v.0.9-80 # Author: Maxime Hervé (2021)
dplyr v.1.0.6 # Author: Hadley Wickham et al. (2021)
car v.3.0-10 # Author: John Fox et al. (2020)
plyr v.1.8.6 # Author: Hadley Wickham (2020)
rgdal v.1.2-27 # Author: Roger Bivand et al. (2021)
magrittr v.2.0.1 # Author: Stefan Bache et al. (2020)
gstat v.2.0-8 # Author: Edzer Pebesma and Benedikt Graeler (2021)
vegan v.2.5-7 # Author: Jari Oksanen et al. (2020)
geosphere v.1.5-10 # Author: Robert J. Hijmans et al. (2019)
ade4 v.1.7-17 # Author: Stéphane Dray et al. (2021)
tseries v.0.10-48 # Author: Adrian Trapletti et al. (2020)

#Installation instructions for mgcv.helper from Github (this package is not available on CRAN)
#https://github.com/samclifford/mgcv.helper
library(devtools)
devtools::install_github("samclifford/mgcv.helper")
mgcv.helper v.0.1.9 # Author: Sam Clifford (2021)
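
A minimal sketch of how the installed package versions could be compared with the versions listed above. The helper name check_versions and the three-package subset are illustrative assumptions and are not part of the original script; only base R functions are used.

```r
# Versions copied from the list above (only a subset is shown here).
expected <- c(mgcv = "1.8-35", ggplot2 = "3.3.3", MASS = "7.3-54")

# Hypothetical helper: compare installed versions against the expected ones.
# `installed` is a named character vector (package name -> version string);
# by default it is read from the local library.
check_versions <- function(expected,
                           installed = installed.packages()[, "Version"]) {
  status <- vapply(names(expected), function(pkg) {
    if (!pkg %in% names(installed)) return("missing")
    cmp <- compareVersion(installed[[pkg]], expected[[pkg]])  # -1, 0 or 1
    c("older", "same", "newer")[cmp + 2]
  }, character(1))
  data.frame(package = names(expected), expected = unname(expected),
             status = unname(status), stringsAsFactors = FALSE)
}

print(check_versions(expected))
```

A status of "older" or "newer" does not necessarily mean the code will fail, only that results may differ from those obtained with the versions listed here.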

#########################################################################################

Folder: Genetic Sequences
This folder contains all sequences of the seven genetic markers used in this study. 
File "samples and localities.csv": this file contains information about the geographic origin of the samples provided. Detailed descriptions of samples, primers used, amplification, sequencing, and analyses are provided in Sotelo-Muñoz, M., M. Maldonado-Coelho, M. Svensson-Coelho, S. S. dos Santos, and C. Y. Miyaki. 2020. Vicariance, dispersal, extinction and hybridization underlie the evolutionary history of Atlantic Forest fire-eye antbirds (Aves: Thamnophilidae). Mol. Phylogenet. Evol. 148: 106820.
Variable list:
No. Sample: number identifying the individual genetic samples
Species: species identification
Locality: sampling locality
Country: country name where samples were obtained


File "BRM15.fasta": file containing sequences for the intron 15 of the brahma protein gene (BRM15, 349 bp, Z-chromosome linked locus)
File "CHDZ.fasta": file containing sequences for the intron 18 of the chromo helicase DNA binding protein gene (CHDZ-18, 324 bp, Z-chromosome linked locus)
File "PLAA.fasta": file containing sequences for the intron 1 of the phospholipase A2 gene (PLAA1, 660 bp, Z-chromosome linked locus)
File "55J7.fasta": file containing sequences for the autosomal anonymous locus 55J7 (298 bp)
File "GK439.fasta": file containing sequences for the autosomal anonymous locus GK439 (330 bp)
File "VIDY.fasta": file containing sequences for the autosomal anonymous locus VIDY (352 bp)
File "ND2.fasta": file containing sequences for the mitochondrial (mtDNA) NADH dehydrogenase subunit II (ND2; 996 bp)
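
A minimal base-R sketch for reading one of these files and checking sequence lengths against the marker lengths stated above. It assumes the files follow the standard FASTA layout (a ">" header line followed by one or more sequence lines); the function name read_fasta and the toy sequences are illustrative and not taken from the original analyses.

```r
# Hypothetical helper: read a FASTA file into a named character vector
# (names = header lines without ">", values = sequences with line breaks removed).
read_fasta <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]             # drop empty lines
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)               # record index for every line
  keep <- !is_header & record > 0           # sequence lines that follow a header
  seqs <- tapply(lines[keep], record[keep], paste, collapse = "")
  ids <- sub("^>", "", lines[is_header])
  setNames(as.character(seqs), ids[as.integer(names(seqs))])
}

# Example with a made-up two-sequence file written to a temporary path.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">ind1", "ACGT", "TTGA", ">ind2", "ACGA"), tmp)
seqs <- read_fasta(tmp)
nchar(seqs)  # sequence length per individual
```

For the real files, the path passed to read_fasta would point to, for example, "ND2.fasta" inside the Genetic Sequences folder.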

Folder: R code
This folder contains a file with the R script 
File: "Code_Maldonado-Coelho_et_al_Am_Nat_2022.R": this file contains the annotated code used in analyses and figures

Folder: Songs_Samples
This folder contains a sample of the songs used in the analyses. File names provide the individual identification number and sampling locality. Information on identification numbers and sampling localities is provided in the files "maleleucopteraatra_means.csv" and "TableS1.csv" described below.


Folder: Data files
This folder contains all data files used in the analyses

File
"TableS1.csv": This file contains, for each of the two fire-eye species (Pyriglena atra and P. leucoptera), the names of sampling localities (column Locality), the geographic coordinates for each sampling locality (columns Lat. and Long.) and the number of male songs recorded in each locality (column Male songs).

File
"Morphomterics_wing_tarsus_bill_Pyriglena.csv": this file contains raw morphometric measurements of the southern fire-eye (Pyriglena leucoptera).
Variable list:
The columns represent sampling locality names, including field work and museum specimens. The column ID number indicates the museum catalogue number, field sampling number or the genetic collection number. The morphometric variables include Tarsus length (unit: millimeters), Wing Length (unit: millimeters), Bill Length (unit: millimeters), PCA body size axis 1 for Tarsus and Wing (PCwingtarsus) and z-transformed values for each of these variables. 

File
"Male_pyriglena_songs_raw_data.csv": this file contains raw individual measurements of the five song notes for each sampling locality.
Variable list:
These include number of notes (PaceNotes), note length (NoteLen1-5 of each individual, unit: milliseconds), note maximum frequency (HiFreq1-5 of each individual, unit: hertz), note interval (SpaceLen1-5 of each individual, unit: milliseconds) as well as the individual mean of each of the last three variables (e.g., NoteLenMean).
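
A minimal sketch of how a within-individual mean such as NoteLenMean can be derived from the five per-note columns. It assumes the columns are named NoteLen1 to NoteLen5 (read from the "NoteLen1-5" shorthand above); the helper name add_note_mean, the toy values and the handling of missing notes (na.rm = TRUE) are assumptions, not the documented procedure.

```r
# Toy data with the assumed column layout (values are made up).
songs <- data.frame(
  NoteLen1 = c(50, 60), NoteLen2 = c(52, 61), NoteLen3 = c(54, 59),
  NoteLen4 = c(51, 62), NoteLen5 = c(53, 58)
)

# Hypothetical helper: append "<prefix>Mean" as the row-wise mean of
# the columns <prefix>1 ... <prefix>n.
add_note_mean <- function(df, prefix, n = 5) {
  cols <- paste0(prefix, seq_len(n))
  # na.rm = TRUE is an assumption about how missing notes would be treated.
  df[[paste0(prefix, "Mean")]] <- rowMeans(df[, cols, drop = FALSE], na.rm = TRUE)
  df
}

songs <- add_note_mean(songs, "NoteLen")
songs$NoteLenMean
```

The same call with the prefixes "HiFreq" and "SpaceLen" would give the other two individual means.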


File
"maleleucopteraatra_means.csv": this file contains the data set that was used to describe the general patterns of geographic variation in songs. 
Variable list:
TaxonName: the two fire-eye species studied
CutNumber: song recording identification number 
SiteName: sampling locality
Lat: Latitude 
Long: Longitude 
PaceNotes: number of notes in each individual song
NoteLenMean: within-individual mean note length (unit: milliseconds)
HiFreqMean: within-individual mean maximum note frequency (unit: hertz) 
SpaceLenMean: within-individual mean note interval (unit: milliseconds)
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA
Latsouth: used to model the effect of absolute latitude on song variables. 
Numberac: used in the spatial models to single out each sampling locality. 

File
"maleleucoptera_indivfinal_2022.csv": This data set was used in the Generalized Additive Models (GAMs) and Multiple Matrix Regression with Randomization (MMRRs). 
Variable list:
TaxonName: Fire-eye species 
CutNumber: the individual recording number
LGEMA number: the catalogue number of the genetic collection at the University of São Paulo
SiteName: sampling locality name
Lat: latitude of each sampling site
Long: longitude of each sampling site
LatS: latitude south of each sampling site 
Billlength: mean bill length (original unit: millimeters) for each sampling site
Pcwingtarsus: mean PCA axis 1 for body size (combining wing and tarsus length) for each sampling site
Q_atra_k4: individual coefficient of admixture generated by STRUCTURE (k=4)
Q_atra_k4log: natural log of the individual coefficient of admixture 
Q_atra_k4zscores: z-scores of the natural log of the individual coefficient of admixture
Q_atra_k4logzscores_positive: positive values of the z-scores of the natural log of the individual coefficient of admixture. Used in a quasipoisson distribution family model. 
GeoDistanceHZ: geographic distance of each sampling site to the hybrid zone
PleucopteraposterprobR: posterior probability (from a Discriminant Function Analysis, DFA) that individual songs of the southern fire-eye resemble those of allopatric populations of this species.
PaceNotes: number of notes in each individual song
NoteLen: within-individual mean note length (unit: milliseconds)
HiFreq: within-individual mean maximum note frequency (unit: hertz) 
SpaceLen: within-individual mean note interval (unit: milliseconds)
PaceNoteszscores: z-scores of individual number of notes in each song
PaceNoteszscorespositive: positive values of z-scores of number of notes in each individual song
NoteLenzscores: z-scores of the within-individual mean note length (original unit: milliseconds)
HiFreqzscores: z-scores of the within-individual mean maximum note frequency (original unit: hertz) 
SpaceLenzscores: z-scores of the within-individual mean note interval (original unit: milliseconds)
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA
PC1zscores: z-scores of the individual vocal PC1
PC2zscores: z-scores of the individual vocal PC2
GeodistanceNorthleucoptera: geographic distance of each sampling site to the northernmost P. leucoptera sampling site
Temperature: temperature of each sampling site (unit: degrees Celsius)
NDVI: Normalized Difference Vegetation Index (forest cover) of each sampling site
Latsouthcor: latitude values of each sampling site for spatial models 
Longsouthcor: longitude values of each sampling site for spatial models
NDVIzscores: z-scores of Normalized Difference Vegetation Index (forest cover) of each sampling site
Billlengthzscores: z-scores of mean bill length (original unit: millimeter)
Pcwingtarsuszscores: z-scores of mean PCA axis 1 for body size (combining wing and tarsus length)
Q_atra_k4logzscores: z-scores of the natural log of the individual coefficient of admixture
Temperaturezscores: z-scores of temperature of each sampling site (original unit: degrees Celsius)
LatSzscores: z-scores of latitude for each sampling site
LatSzscoresinv: inverse of latitude values for each sampling site, for easier interpretation of GAM coefficients 
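
Several variables above are natural-log or z-score transformations of other columns. A minimal sketch of these two transformations follows, assuming the conventional z-score (value minus mean, divided by the sample standard deviation); the helper name z_score and the admixture values are made up. How the "positive" versions of the z-scores were obtained is not described here, so that step is not sketched.

```r
# Conventional z-score; the use of the sample SD is an assumption.
z_score <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

q <- c(0.02, 0.10, 0.50, 0.95)  # made-up admixture coefficients
q_log <- log(q)                 # natural log, analogous to Q_atra_k4log
q_log_z <- z_score(q_log)       # analogous to Q_atra_k4logzscores

round(q_log_z, 3)
```

The base function scale() gives the same values and can be used instead of the helper.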

File
"maleleucopteraatra_meansforDFA-allinfo.csv": the file contains variables for Figure 5A and Figure S17
Variable list:
DistrefHZ: geographic distance of each sampling locality to the hybrid zone
PleucopteraposterprobR: posterior probability (from a Discriminant Function Analysis) that individual songs resemble the allopatric southern fire-eye
NDVI: Normalized Difference Vegetation Index (forest cover) for each sampling site

File
"PosteriorprobvsQ_atra_leucop_November2021.csv": the file contains variables for Figure 5B
Variable list:
Q_atra_k4: coefficient of individual admixture values generated by STRUCTURE (k=4)
Q_atra_k4log: natural log of the coefficient of individual admixture values generated by STRUCTURE (k=4)
PleucopteraposterprobR: posterior probability (from a Discriminant Function Analysis) that individual songs resemble the allopatric southern fire-eye

File
"Freq_variance_males.csv": the file contains variables for Figure 5D
Variable list:
HiFreqMean.var.original: variance of maximum note frequency of each sampling site 
HiFreqMean.var: variance of maximum note frequency /1000 of each sampling site 
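
A minimal sketch of how the two variance columns relate: the per-site variance of maximum note frequency, and the same value divided by 1000. The grouping column SiteName and the toy values are assumptions made for illustration; the original script may compute these differently.

```r
# Made-up individual values for two hypothetical sites.
songs <- data.frame(
  SiteName = c("A", "A", "A", "B", "B", "B"),
  HiFreqMean = c(2000, 2100, 2200, 2500, 2500, 2800)
)

# Sample variance of HiFreqMean within each site.
site_var <- aggregate(HiFreqMean ~ SiteName, data = songs, FUN = var)
names(site_var)[2] <- "HiFreqMean.var.original"

# Rescaled variance (variance / 1000), as described for HiFreqMean.var.
site_var$HiFreqMean.var <- site_var$HiFreqMean.var.original / 1000
site_var
```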

File
"Q_values_allindiv_vsGeodistHZ_December2021.csv": this file contains variables for Figure S2 
Variable list:
Genetic_ID_Number: individual genetic number
Qatra_K4: individual coefficient of admixture values generated by STRUCTURE (k=4)
DistHZ: geographic distance of each sampling site to the hybrid zone

File
"lda_scores&posterior_indivDFA-allopatric_only_March2022.csv": this file contains variables for Figure S4
Variable list:
Species: species assigned by the discriminant function
posterior.Pyriglena.atra: posterior probability assignment of an individual song as the northern fire-eye (Pyriglena atra)
posterior.Pyriglena.leucoptera: posterior probability assignment of an individual song as the southern fire-eye (Pyriglena leucoptera)
LD1: individual scores of discriminant function axis 1
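
A minimal sketch of how a table with these columns could be produced with lda() from the MASS package listed above. The data are simulated, only two predictors are used, and the group labels are written with dots so that the output column names match those in the variable list; none of this is taken from the original script.

```r
library(MASS)

set.seed(1)
# Made-up z-scored song variables for two groups of 20 songs each.
songs <- data.frame(
  Species = factor(rep(c("Pyriglena.atra", "Pyriglena.leucoptera"), each = 20)),
  ZPaceNotes = c(rnorm(20, -1), rnorm(20, 1)),
  ZNoteLenMean = c(rnorm(20, 1), rnorm(20, -1))
)

# Linear discriminant analysis and predictions for the training songs.
fit <- lda(Species ~ ZPaceNotes + ZNoteLenMean, data = songs)
pred <- predict(fit)

# Assemble assigned class, posterior probabilities and LD1 scores.
out <- data.frame(
  Species = pred$class,
  posterior = pred$posterior,
  LD1 = pred$x[, "LD1"]
)
head(out)
```

With two groups there is a single discriminant axis, which is why only LD1 appears.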

File
"variance_allosvscontact_boxplots_2022.csv": this file contains variables for Figure S5
Variable list:
Group: group in which individuals were pooled to compare the vocal variance; atraallo- P. atra allopatric; atracontact- P. atra contact zone; leucontact- P. leucoptera contact zone; leucoallo- P. leucoptera allopatric
PaceNotes: number of notes in each individual song
NoteLenMean: within-individual mean note length (unit: milliseconds)
HiFreqMean: within-individual mean maximum note frequency (unit: hertz)
SpaceLenMean: within-individual mean note interval (unit: milliseconds)
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA

File
"allrange_DFA-Zscores_2022.csv": this file contains data for a Discriminant Function Analysis for all songs across the range of the two fire-eye species
Variable list:
Species: species identification based on plumage
ZPaceNotes: z-scores of number of notes in each individual song
ZNoteLenMean: z-scores of the within-individual mean note length (original units: milliseconds)
ZHiFreqMean:  z-scores of the within-individual mean maximum note frequency (original units: hertz)
ZSpaceLenMean: z-scores of the within-individual mean note interval (original units: milliseconds)

File
"DFA_Zscores_allopatric&contactzone.csv": this file contains data for a Discriminant Function Analysis to compare songs in allopatry and in the contact zone
Variable list:
Species: species identification 
ZPaceNotes: z-scores of number of notes in each individual song
ZNoteLenMean: z-scores of the within-individual mean note length (original units: milliseconds)
ZHiFreqMean: z-scores of the within-individual mean maximum note frequency (original units: hertz)
ZSpaceLenMean: z-scores of the within-individual mean note interval (original units: milliseconds)

File
"DFA_Zscores_allopatric-only-March_2022.csv": file contains data for a Discriminant Function Analysis for only allopatric songs of the two fire-eye species
Variable list:
Species: species identification based on plumage
ZPaceNotes: z-scores of number of notes in each individual song
ZNoteLenMean: z-scores of the within-individual mean note length (original units: milliseconds)
ZHiFreqMean: z-scores of the within-individual mean maximum note frequency (original units: hertz)
ZSpaceLenMean: z-scores of the within-individual mean note interval (original units: milliseconds)

File
"maleleucoptera_Fratio_allopat_2022.csv": this file contains the data to estimate song variance of the southern fire-eye species (Pyriglena leucoptera) for allopatric sites
Variable list:
TaxonName: species identification	
CutNumber: identification of each individual recorded song	
SiteName: sampling site	
Latsouth: latitude south for each sampling site 	
PaceNotes: number of notes in each individual song	
NoteLenMean: within-individual mean note length (milliseconds)	
HiFreqMean: within-individual mean maximum note frequency (hertz)	
SpaceLenMean: within-individual mean note interval (milliseconds)	
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA

File
"maleleucoptera_Fratio_contactzone_2022.csv": this file contains the data to estimate song variance of the southern fire-eye species (Pyriglena leucoptera) for contact zone sites
Variable list:
TaxonName: species identification	
CutNumber: individual identification of each recorded song	
SiteName: sampling site	
Latsouth: latitude south for each sampling site 	
PaceNotes: number of notes in each individual song	
NoteLenMean: within-individual mean note length (unit: milliseconds)	
HiFreqMean: within-individual mean maximum note frequency (unit: hertz)	
SpaceLenMean: within-individual mean note interval (unit: milliseconds)	
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA

File
"maleatra_testeofvariance_final_march2022_allop.csv": this file contains the data to estimate song variance of the northern fire-eye species (Pyriglena atra) for allopatric sites
Variable list:
TaxonName: species identification	
CutNumber: identification of each individual recorded song	
SiteName: sampling site	
Latsouth: latitude south of each sampling site	
PaceNotes: number of notes in each individual song	
NoteLenMean: within-individual mean note length (unit: milliseconds)	
HiFreqMean: within-individual mean maximum note frequency (unit: hertz)
SpaceLenMean: within-individual mean note interval (unit: milliseconds)
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA

File
"maleatra_testeofvariance_final_march2022_contactzone.csv": this file contains the data to estimate song variance of the northern fire-eye species (Pyriglena atra) for contact zone sampling sites
Variable list:
TaxonName: species identification	
CutNumber: individual identification of each recorded song	
SiteName: sampling site	
Latsouth: Latitude south of each sampling site	
PaceNotes: number of notes in each individual song	
NoteLenMean: within-individual mean note length (unit: milliseconds)	
HiFreqMean: within-individual mean maximum note frequency (unit: hertz)
SpaceLenMean: within-individual mean note interval (unit: milliseconds)
PC1: individual vocal PC1, the first axis of song PCA
PC2: individual vocal PC2, the second axis of song PCA
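
The four files above hold allopatric and contact-zone samples for each species so that song variance can be compared between them. One way to make such a comparison is an F test of the ratio of two variances, sketched below with base R var.test() on simulated values. Whether the original analysis used this function, and which group served as the numerator, are assumptions.

```r
set.seed(42)
# Made-up maximum note frequencies (hertz) for the two groups.
allopatric <- data.frame(HiFreqMean = rnorm(30, mean = 2500, sd = 100))
contact    <- data.frame(HiFreqMean = rnorm(25, mean = 2500, sd = 200))

# F test: ratio of contact-zone variance to allopatric variance.
ft <- var.test(contact$HiFreqMean, allopatric$HiFreqMean)
ft$statistic  # F = var(contact) / var(allopatric)
ft$parameter  # numerator and denominator degrees of freedom
ft$p.value
```

With the real data, each data frame would be read from the corresponding allopatric or contact-zone file, and the same call would be repeated for the other song variables.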

File
"individual_distances_August_2022.txt": this file contains the individual pairwise genetic distances used in Multiple Matrix Regression with Randomization (MMRRs)

File
"Variables_dist_colum.csv": This file contains the data from pairwise matrices used in Figures S11-S16
Variable list:
indivx: identification number for the pairwise match with individual "indivy" for each variable
indivy: identification number for the pairwise match with individual "indivx" for each variable	
gen_distance: Kimura two-parameter pairwise individual genetic distances
Highfreq_distance: pairwise Euclidean distances of individual maximum note frequency (original unit: hertz)	
pacenote_distance: pairwise Euclidean distances of individual number of notes		
notelength_distance: pairwise Euclidean distances of individual note length (original unit: milliseconds)		
Spacelengthdistance: pairwise Euclidean distances of individual note interval (original unit: milliseconds)		
PC1_distance: pairwise Euclidean distances of individual vocal PC1	
PC2_distance: pairwise Euclidean distances of individual vocal PC2	
NDVI_distance: pairwise Euclidean distances of normalized difference vegetation index among sampling sites		
temperature_distance: pairwise Euclidean distances of temperature among sampling sites (original unit: degrees Celsius)
Pcwingtarsus_distance: pairwise Euclidean distances of mean body size PC1
Qvalueatra_distance: pairwise Euclidean distances of individual coefficient of ancestry			
Geo_distance_km: pairwise geographic distances (unit: kilometers) among sampling sites		
billlength_distance: pairwise Euclidean distances of mean bill length (original unit: millimeters) among sampling sites	
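
A minimal sketch of how a pairwise distance matrix can be unfolded into the column layout described above (one row per pair of individuals). The individual labels and frequency values are made up, and the choice of which member of each pair goes into indivx versus indivy is an assumption.

```r
# Made-up maximum note frequencies (hertz) for three individuals.
vals <- c(ind1 = 2400, ind2 = 2550, ind3 = 2300)

# Pairwise Euclidean distances as a full symmetric matrix.
d <- as.matrix(dist(vals))

# Keep each pair once (lower triangle) and store it in long format.
idx <- which(lower.tri(d), arr.ind = TRUE)
long <- data.frame(
  indivx = rownames(d)[idx[, "row"]],
  indivy = colnames(d)[idx[, "col"]],
  Highfreq_distance = d[idx],
  stringsAsFactors = FALSE
)
long
```

Further distance columns can be added in the same way, provided every matrix lists the individuals in the same order.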

File
"leucoptera_notelength_vs_distHZ.csv": file contains data for Figure S18
Variable list:
Species: species identification
CutNumber: individual identification of each recorded song	
SiteName: sampling site	
DistrefHZnegat: geographic distance (unit: kilometers) of each sampling site to the hybrid zone	
NoteLenMean: within-individual mean note length (unit: milliseconds)

File
"atra_notelength_vs_distHZ": file contains data for Figure S19
Variable list:
Species: species identification
CutNumber: individual identification of each recorded song	
SiteName: sampling site 
DistrefHZ: geographic distance (unit: kilometers) of each sampling site to the hybrid zone
NoteLenMean: within-individual mean note length (unit: milliseconds)



